Heterogeneous Robot Collaboration in Unstructured Environments with Grounded Generative Intelligence

Ravichandran, Zachary, Cladera, Fernando, Prabhu, Ankit, Hughes, Jason, Murali, Varun, Taylor, Camillo, Pappas, George J., Kumar, Vijay

arXiv.org Artificial Intelligence

Heterogeneous robot teams operating in realistic settings often must accomplish complex missions requiring collaboration and adaptation to information acquired online. Because robot teams frequently operate in unstructured environments -- uncertain, open-world settings without prior maps -- subtasks must be grounded in robot capabilities and the physical world. While heterogeneous teams have typically been designed for fixed specifications, generative intelligence opens the possibility of teams that can accomplish a wide range of missions described in natural language. However, current large language model (LLM)-enabled teaming methods typically assume well-structured and known environments, limiting deployment in unstructured environments. We present SPINE-HT, a framework that addresses these limitations by grounding the reasoning abilities of LLMs in the context of a heterogeneous robot team through a three-stage process. Given language specifications describing mission goals and team capabilities, an LLM generates grounded subtasks which are validated for feasibility. Subtasks are then assigned to robots based on capabilities such as traversability or perception and refined given feedback collected during online operation. In simulation experiments with closed-loop perception and control, our framework achieves nearly twice the success rate compared to prior LLM-enabled heterogeneous teaming approaches. In real-world experiments with a Clearpath Jackal, a Clearpath Husky, a Boston Dynamics Spot, and a high-altitude UAV, our method achieves an 87% success rate in missions requiring reasoning about robot capabilities and refining subtasks with online feedback. More information is provided at https://zacravichandran.github.io/SPINE-HT.
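The capability-grounded assignment stage described above can be sketched as a simple feasibility check: each subtask declares the capabilities it requires, and a subtask is only assigned to a robot whose capability set covers it. This is a hypothetical illustration, not the SPINE-HT implementation; the robot names, capability labels, and first-fit assignment rule are all assumptions.

```python
# Hypothetical sketch of capability-grounded subtask assignment.
# Capability labels and the first-fit rule are illustrative assumptions.

def assign_subtasks(subtasks, robots):
    """Assign each subtask to the first robot whose capabilities cover it."""
    assignments = {}
    for task, required in subtasks.items():
        for robot, capabilities in robots.items():
            if required <= capabilities:  # set inclusion: robot covers needs
                assignments[task] = robot
                break
        else:
            assignments[task] = None      # infeasible: no robot qualifies
    return assignments

robots = {
    "jackal": {"ground", "lidar"},
    "uav":    {"aerial", "camera"},
}
subtasks = {
    "map_road":    {"ground", "lidar"},
    "survey_area": {"aerial", "camera"},
    "dive_lake":   {"underwater"},
}
print(assign_subtasks(subtasks, robots))
```

Marking infeasible subtasks with `None` mirrors the paper's idea of validating LLM-generated subtasks before execution rather than dispatching them blindly.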


BiFlex: A Passive Bimodal Stiffness Flexible Wrist for Manipulation in Unstructured Environments

Jeong, Gu-Cheol, Gasperina, Stefano Dalla, Deshpande, Ashish D., Chin, Lillian, Martín-Martín, Roberto

arXiv.org Artificial Intelligence

Robotic manipulation in unstructured, human-centric environments poses a dual challenge: achieving the precision needed for delicate free-space operation while ensuring safety during unexpected contact events. Traditional wrists struggle to balance these demands, often relying on complex control schemes or complicated mechanical designs to mitigate potential damage from force overload. In response, we present BiFlex, a flexible robotic wrist that uses a soft buckling honeycomb structure to provide a natural bimodal stiffness response. The higher stiffness mode enables precise household object manipulation, while the lower stiffness mode provides the compliance needed to adapt to external forces. We design BiFlex to maintain a fingertip deflection of less than 1 cm while supporting loads up to 500 g, and create a BiFlex wrist for many grippers, including Panda, Robotiq, and BaRiFlex. We demonstrate that BiFlex simplifies control while maintaining precise object manipulation and enhanced safety in real-world applications. Designing robots capable of physical tasks in unstructured environments remains one of the core open problems of modern robotics. Unstructured settings are characterized by inherent uncertainty that exposes robotic end-effectors to frequent and unpredictable forces. For example, when grasping a flat object or wiping a surface, inaccuracies in the perceived location could lead to the robot missing the target or creating unexpected, dangerously high reactive forces that could damage the robot.
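The bimodal stiffness behavior can be illustrated with a toy piecewise force model: stiff below a buckling threshold, compliant beyond it. The threshold and stiffness values below are illustrative assumptions, not the BiFlex design parameters.

```python
# Toy piecewise-stiffness model of a bimodal wrist. All numbers here
# (stiffnesses, buckling threshold) are illustrative assumptions.

def restoring_force(deflection_m, k_stiff=2000.0, k_soft=200.0, buckle_m=0.01):
    """Return the restoring force (N) for a given deflection (m)."""
    if abs(deflection_m) <= buckle_m:
        return k_stiff * deflection_m  # high-stiffness mode: precise free-space work
    sign = 1.0 if deflection_m > 0 else -1.0
    # Beyond buckling, force grows from the transition point at a much lower slope,
    # limiting reactive forces during unexpected contact.
    return sign * (k_stiff * buckle_m + k_soft * (abs(deflection_m) - buckle_m))

print(restoring_force(0.005))  # stiff regime
print(restoring_force(0.02))   # compliant regime: force rises only slowly
```

The key point of such a passive design is that the mode switch happens mechanically, with no controller involvement.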


Humanoid Agent via Embodied Chain-of-Action Reasoning with Multimodal Foundation Models for Zero-Shot Loco-Manipulation

Wen, Congcong, Bethala, Geeta Chandra Raju, Hao, Yu, Pudasaini, Niraj, Huang, Hao, Yuan, Shuaihang, Huang, Baoru, Nguyen, Anh, Wang, Mengyu, Tzes, Anthony, Fang, Yi

arXiv.org Artificial Intelligence

Humanoid loco-manipulation, which integrates whole-body locomotion with dexterous manipulation, remains a fundamental challenge in robotics. Beyond whole-body coordination and balance, a central difficulty lies in understanding human instructions and translating them into coherent sequences of embodied actions. Recent advances in foundation models provide transferable multimodal representations and reasoning capabilities, yet existing efforts remain largely restricted to either locomotion or manipulation in isolation, with limited applicability to humanoid settings. In this paper, we propose Humanoid-COA, the first humanoid agent framework that integrates foundation model reasoning with an Embodied Chain-of-Action (CoA) mechanism for zero-shot loco-manipulation. Within the perception--reasoning--action paradigm, our key contribution lies in the reasoning stage, where the proposed CoA mechanism decomposes high-level human instructions into structured sequences of locomotion and manipulation primitives through affordance analysis, spatial inference, and whole-body action reasoning. Extensive experiments on two humanoid robots, Unitree H1-2 and G1, in both an open test area and an apartment environment, demonstrate that our framework substantially outperforms prior baselines across manipulation, locomotion, and loco-manipulation tasks, achieving robust generalization to long-horizon and unstructured scenarios. Project page: https://humanoid-coa.github.io/


Transferring Vision-Language-Action Models to Industry Applications: Architectures, Performance, and Challenges

Li, Shuai, Chen, Yizhe, Li, Dong, Liu, Sichao, Lan, Dapeng, Liu, Yu, Pang, Zhibo

arXiv.org Artificial Intelligence

The application of artificial intelligence (AI) in industry is accelerating the shift from traditional automation to intelligent systems with perception and cognition. Vision-language-action (VLA) models have become a key paradigm in AI for unifying perception, reasoning, and control. Has the performance of VLA models met industrial requirements? In this paper, from the perspective of industrial deployment, we compare the performance of existing state-of-the-art VLA models in industrial scenarios and analyze the limitations of VLA models for real-world industrial deployment from the perspectives of data collection and model architecture. The results show that VLA models retain their ability to perform simple grasping tasks even in industrial settings after fine-tuning. However, there is much room for performance improvement in complex industrial environments, diverse object categories, and high-precision placing tasks. Our findings provide practical insight into the adaptability of VLA models for industrial use and highlight the need for task-specific enhancements to improve their robustness, generalization, and precision.


Online Adaptation of Terrain-Aware Dynamics for Planning in Unstructured Environments

Ward, William, Etter, Sarah, Ingebrand, Tyler, Ellis, Christian, Thorpe, Adam J., Topcu, Ufuk

arXiv.org Artificial Intelligence

Autonomous mobile robots operating in remote, unstructured environments must adapt to new, unpredictable terrains that can change rapidly during operation. In such scenarios, a critical challenge becomes estimating the robot's dynamics on changing terrain in order to enable reliable, accurate navigation and planning. We present a novel online adaptation approach for terrain-aware dynamics modeling and planning using function encoders. By learning a set of neural network basis functions that span the robot dynamics on diverse terrains, we enable rapid online adaptation to new, unseen terrains and environments as a simple least-squares calculation. We demonstrate our approach for terrain adaptation in a Unity-based robotics simulator and show that the downstream controller has better empirical performance due to higher accuracy of the learned model. This leads to fewer collisions with obstacles while navigating in cluttered environments as compared to a neural ODE baseline. Rapid adaptation to unknown environments and terrain is critical for autonomous mobile robots. In off-road navigation, unpredictable terrain features such as rocky paths, forest floors, and wet fields can cause skidding, tripping, or immobilization, jeopardizing the robot's ability to reach its objective. Autonomous ground vehicles must therefore dynamically adjust their behavior to terrain-specific conditions. This adaptation is challenging because terrain variations directly alter system dynamics. For example, tire response to acceleration depends on surface friction.
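The "adaptation as a least-squares calculation" idea can be shown in miniature: given a fixed set of basis functions spanning possible dynamics, fitting a new terrain reduces to one linear solve over a handful of observations. The hand-written scalar bases below stand in for the learned neural bases in the paper; the data and coefficients are invented for illustration.

```python
import numpy as np

# Toy sketch of function-encoder-style adaptation: fixed basis functions,
# terrain-specific coefficients fit by least squares. The bases here are
# hand-written stand-ins for learned neural network bases.

def basis(x):
    """Evaluate K=3 scalar basis functions at input x; returns shape (N, 3)."""
    return np.stack([x, x**2, np.sin(x)], axis=-1)

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
true_coeffs = np.array([0.5, -1.0, 2.0])   # "new terrain" dynamics
y = basis(x) @ true_coeffs                 # observed dynamics samples

# Online adaptation = a single linear least-squares solve, no gradient steps.
coeffs, *_ = np.linalg.lstsq(basis(x), y, rcond=None)
print(np.round(coeffs, 3))
```

Because adaptation is a closed-form solve rather than fine-tuning, it can run online each time the terrain changes.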


LOVON: Legged Open-Vocabulary Object Navigator

Peng, Daojie, Cao, Jiahang, Zhang, Qiang, Ma, Jun

arXiv.org Artificial Intelligence

Object navigation in open-world environments remains a formidable and pervasive challenge for robotic systems, particularly when it comes to executing long-horizon tasks that require both open-world object detection and high-level task planning. Traditional methods often struggle to integrate these components effectively, and this limits their capability to deal with complex, long-range navigation missions. In this paper, we propose LOVON, a novel framework that integrates large language models (LLMs) for hierarchical task planning with open-vocabulary visual detection models, tailored for effective long-range object navigation in dynamic, unstructured environments. To tackle real-world challenges including visual jittering, blind zones, and temporary target loss, we design dedicated solutions such as Laplacian Variance Filtering for visual stabilization. We also develop a functional execution logic for the robot that guarantees LOVON's capabilities in autonomous navigation, task adaptation, and robust task completion. Extensive evaluations demonstrate the successful completion of long-sequence tasks involving real-time detection, search, and navigation toward open-vocabulary dynamic targets. In recent years, large language models (LLMs) [1] and vision models [2]-[5] have achieved revolutionary breakthroughs in the field of artificial intelligence.
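The Laplacian variance mentioned above is a classic sharpness score: convolve a frame with a Laplacian kernel and measure the variance of the response, which collapses toward zero for blurred or jittered frames. A minimal NumPy sketch of that metric, assuming the standard 4-neighbor kernel (LOVON's exact filtering rule and thresholds are not specified here):

```python
import numpy as np

# Minimal variance-of-Laplacian sharpness score. Kernel choice and any
# filtering threshold are assumptions, not LOVON's exact design.

LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

def laplacian_variance(img):
    """Variance of the Laplacian response; low values suggest blur/jitter."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):              # 3x3 convolution via shifted slices
        for j in range(3):
            out += LAPLACIAN[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out.var()

sharp = np.tile([0.0, 1.0], (8, 8))   # high-contrast alternating columns
blurry = np.full((8, 16), 0.5)        # flat frame: no edges at all
print(laplacian_variance(sharp), laplacian_variance(blurry))
```

A stabilization filter would drop (or down-weight detections from) frames whose score falls below a tuned threshold.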


Gaussian Splatting as a Unified Representation for Autonomy in Unstructured Environments

Ong, Dexter, Tao, Yuezhan, Murali, Varun, Spasojevic, Igor, Kumar, Vijay, Chaudhari, Pratik

arXiv.org Artificial Intelligence

In this work, we argue that Gaussian splatting is a suitable unified representation for autonomous robot navigation in large-scale unstructured outdoor environments. Such environments require representations that can capture complex structures while remaining computationally tractable for real-time navigation. We demonstrate that the dense geometric and photometric information provided by a Gaussian splatting representation is useful for navigation in unstructured environments. Additionally, semantic information can be embedded in the Gaussian map to enable large-scale task-driven navigation. From the lessons learned through our experiments, we highlight several challenges and opportunities arising from the use of such a representation for robot autonomy. In environments such as those in Figure 1, traditional approaches often struggle to capture the complexity and variability of the scene, presenting challenges for autonomous navigation under such conditions. These capabilities are crucial for applications such as precision agriculture [1], forestry [2], search-and-rescue [3] and infrastructure inspection [4]. To address this, we present Gaussian splatting as a versatile representation for large-scale autonomy in unstructured outdoor environments.


Deploying Foundation Model-Enabled Air and Ground Robots in the Field: Challenges and Opportunities

Ravichandran, Zachary, Cladera, Fernando, Hughes, Jason, Murali, Varun, Hsieh, M. Ani, Pappas, George J., Taylor, Camillo J., Kumar, Vijay

arXiv.org Artificial Intelligence

The integration of foundation models (FMs) into robotics has enabled robots to understand natural language and reason about the semantics in their environments. However, existing FM-enabled robots primarily operate in closed-world settings, where the robot is given a full prior map or has a full view of its workspace. This paper addresses the deployment of FM-enabled robots in the field, where missions often require a robot to operate in large-scale and unstructured environments. To effectively accomplish these missions, robots must actively explore their environments, navigate obstacle-cluttered terrain, handle unexpected sensor inputs, and operate with compute constraints. We discuss recent deployments of SPINE, our LLM-enabled autonomy framework, in field robotic settings. To the best of our knowledge, we present the first demonstration of large-scale LLM-enabled robot planning in unstructured environments with several kilometers of missions. SPINE is agnostic to a particular LLM, which allows us to distill small language models capable of running onboard size, weight, and power (SWaP)-limited platforms. Via preliminary model distillation work, we then present the first language-driven UAV planner using on-device language models. We conclude our paper by proposing several promising directions for future research.


DRPA-MPPI: Dynamic Repulsive Potential Augmented MPPI for Reactive Navigation in Unstructured Environments

Fuke, Takahiro, Endo, Masafumi, Honda, Kohei, Ishigami, Genya

arXiv.org Artificial Intelligence

Reactive mobile robot navigation in unstructured environments is challenging when robots encounter unexpected obstacles that invalidate previously planned trajectories. Model predictive path integral control (MPPI) enables reactive planning, but still suffers from limited prediction horizons that lead to local minima traps near obstacles. Current solutions rely on heuristic cost design or scenario-specific pre-training, which often limits their adaptability to new environments. We introduce dynamic repulsive potential augmented MPPI (DRPA-MPPI), which dynamically detects potential entrapments on the predicted trajectories. Upon detecting local minima, DRPA-MPPI automatically switches between standard goal-oriented optimization and a modified cost function that generates repulsive forces away from local minima. Comprehensive testing in simulated obstacle-rich environments confirms that DRPA-MPPI achieves superior navigation performance and safety compared to conventional methods, at lower computational cost.
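The cost-switching idea can be sketched in a few lines: score rollouts against the goal as usual, and when an entrapment is detected, add a repulsive potential around the suspected local minimum so those rollouts become expensive. The gains, the inverse-distance potential, and the boolean entrapment flag below are all illustrative assumptions standing in for DRPA-MPPI's detector and cost design.

```python
import math

# Hypothetical sketch of a switched MPPI rollout cost: goal attraction
# always, plus a repulsive potential only when entrapment is detected.
# Gains and the inverse-distance potential are illustrative assumptions.

def trajectory_cost(states, goal, trap_center, entrapped):
    cost = 0.0
    for x, y in states:
        cost += math.hypot(goal[0] - x, goal[1] - y)       # goal attraction
        if entrapped:                                       # DRPA-style switch
            d = math.hypot(trap_center[0] - x, trap_center[1] - y)
            cost += 5.0 / (d + 1e-3)                        # repulsion from trap
    return cost

states = [(1.0, 0.0), (1.1, 0.0)]          # rollout stalled near the trap
goal, trap_center = (5.0, 0.0), (1.2, 0.0)
plain = trajectory_cost(states, goal, trap_center, entrapped=False)
repulsed = trajectory_cost(states, goal, trap_center, entrapped=True)
print(repulsed > plain)
```

Under the switched cost, MPPI's sample weighting naturally favors rollouts that leave the trap region, without retuning the standard goal-directed objective.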


Bench2FreeAD: A Benchmark for Vision-based End-to-end Navigation in Unstructured Robotic Environments

Peng, Yuhang, Wang, Sidong, Yang, Jihaoyu, Li, Shilong, Wang, Han, Gong, Jiangtao

arXiv.org Artificial Intelligence

Most current end-to-end (E2E) autonomous driving algorithms are built on standard vehicles in structured transportation scenarios, lacking exploration of robot navigation for unstructured scenarios such as auxiliary roads, campus roads, and indoor settings. This paper investigates E2E robot navigation in unstructured road environments. First, we introduce two data collection pipelines -- one for real-world robot data and another for synthetic data generated using the Isaac Sim simulator -- which together produce an unstructured robotics navigation dataset, the FreeWorld Dataset. Second, we fine-tuned an efficient E2E autonomous driving model, VAD, using our datasets to validate the performance and adaptability of E2E autonomous driving models in these environments. Results demonstrate that fine-tuning through our datasets significantly enhances the navigation potential of E2E autonomous driving models in unstructured robotic environments. Thus, this paper presents the first dataset targeting E2E robot navigation tasks in unstructured scenarios, and provides a benchmark based on vision-based E2E autonomous driving algorithms to facilitate the development of E2E navigation technology for logistics and service robots. The project is available on GitHub.